I tested Copilot Vision for Windows. Its AI eyes need better glasses
The whole point of Microsoft Copilot Vision for Windows is that it acts like an AI assistant, looking over your shoulder as you struggle through a task and making suggestions. So I was pretty convinced that if Microsoft released Copilot Vision for testing, it would be able to do something simple, like help me play Windows Solitaire. Sometimes, Microsoft's new Copilot Vision for Windows feels like a real step forward for useful AI: this emerging Windows technology sees what you see on your screen, allowing you to talk to your PC and ask it for help. Unfortunately, that step forward is too often followed by the proverbial two steps back. Copilot Vision for Windows is, at times, genuinely helpful. Outside of some nostalgic tears from former Microsoft CEO Steve Ballmer, its announcement was the highlight of Microsoft's 50th anniversary celebration at the company's Redmond, Washington campus.
Microsoft celebrates 50 years with major Copilot announcements and new features
Microsoft is celebrating its 50th anniversary, and the company is having some fun with it. The iconic Windows 95 logo has resurfaced, a themed version of Solitaire is available, and Bill Gates even posted the source code for the company's first product, the Altair BASIC interpreter. Microsoft's Copilot is getting some love, too. Actually, it would be more accurate to say that Microsoft has been showing Copilot a lot of love over the last few days: announcements have been flying left and right, culminating in a livestream from Microsoft's global headquarters in Redmond, Washington, with even more information about current and upcoming Copilot features. Microsoft also had Copilot interview three Microsoft CEOs.
Learning Optimal Tax Design in Nonatomic Congestion Games
Maryam Fazel (Paul G. Allen School of Computer Science; Department of Electrical Engineering)
In multiplayer games, self-interested behavior among the players can harm the social welfare. Tax mechanisms are a common method to alleviate this issue and induce socially optimal behavior. In this work, we take an initial step toward learning the optimal tax that maximizes social welfare with limited feedback in congestion games. We propose a new type of feedback named equilibrium feedback, where the tax designer can only observe the Nash equilibrium after deploying a tax plan. Existing algorithms are not applicable due to the exponentially large tax function space, the nonexistence of the gradient, and the nonconvexity of the objective. To tackle these challenges, we design a computationally efficient algorithm that leverages several novel components: (1) a piecewise-linear tax to approximate the optimal tax; (2) extra linear terms to guarantee a strongly convex potential function; (3) an efficient subroutine to find an exploratory tax that provides critical information about the game.
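To make component (1) concrete, here is a minimal Python sketch of a piecewise-linear tax on a single edge, with an added linear term standing in for component (2)'s strong-convexity correction. The knot grid, knot values, and `slope` parameter are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def piecewise_linear_tax(load, knots, values, slope=0.1):
    """Tax on one edge as a function of its load fraction in [0, 1].

    Linear interpolation between (knot, value) pairs gives the
    piecewise-linear approximation of component (1); the `slope * load`
    term is a stand-in for component (2)'s extra linear correction
    that keeps the induced potential strongly convex.
    """
    return np.interp(load, knots, values) + slope * load

# Toy usage: a 5-knot tax schedule on a single congested edge.
knots = np.linspace(0.0, 1.0, 5)
values = np.array([0.0, 0.1, 0.3, 0.6, 1.0])
print(piecewise_linear_tax(0.45, knots, values))
```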
Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models
As Archimedes famously said, "Give me a lever long enough and a fulcrum on which to place it, and I shall move the world." In this spirit, we propose to use a tiny Language Model (LM), e.g., a Transformer with 67M parameters, to lever much larger Vision-Language Models (LVLMs) with 9B parameters. Specifically, we use this tiny Lever-LM to configure effective in-context demonstration (ICD) sequences that improve the In-Context Learning (ICL) performance of LVLMs. Previous studies show that ICD configuration choices, such as the selection and ordering of the demonstrations, heavily affect ICL performance, highlighting the significance of configuring effective ICD sequences. Motivated by this, and by reconsidering the process of configuring an ICD sequence, we find that it mirrors human sentence composition, and we further assume that effective ICD configurations contain internal statistical patterns that can be captured by Lever-LM. We then construct a dataset of effective ICD sequences to train Lever-LM. After training, given novel queries, the trained Lever-LM configures new ICD sequences to solve vision-language tasks through ICL. Experiments show that these ICD sequences improve the ICL performance of two LVLMs over strong baselines in Visual Question Answering and Image Captioning, validating that Lever-LM can indeed capture the statistical patterns needed to lever LVLMs.
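As a rough illustration of the role Lever-LM plays at inference time, the sketch below greedily builds an ICD sequence by repeatedly asking a scoring model which demonstration to append next. The `lever_lm_score` interface and the dot-product toy scorer are hypothetical stand-ins for the trained 67M-parameter Transformer, which the paper uses to generate the sequence.

```python
import numpy as np

def configure_icd_sequence(query_emb, demo_embs, lever_lm_score, length=4):
    """Greedily build an in-context demonstration (ICD) sequence.

    lever_lm_score(prefix, cand_idx, query_emb) is a stand-in for the
    trained Lever-LM: it scores how promising it is to append candidate
    `cand_idx` after the current prefix for this query (hypothetical
    interface, for illustration only).
    """
    sequence = []
    remaining = set(range(len(demo_embs)))
    for _ in range(length):
        best = max(remaining, key=lambda i: lever_lm_score(sequence, i, query_emb))
        sequence.append(best)
        remaining.remove(best)
    return sequence

# Toy usage with a similarity "score" standing in for Lever-LM.
rng = np.random.default_rng(0)
demos = rng.normal(size=(10, 8))   # 10 candidate demonstrations, 8-dim embeddings
query = rng.normal(size=8)
score = lambda prefix, i, q: float(demos[i] @ q)
print(configure_icd_sequence(query, demos, score, length=4))
```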
Heterogeneous Bitwidth Binarization in Convolutional Neural Networks
Joshua Fromm, Shwetak Patel, Matthai Philipose
Recent work has shown that fast, compact low-bitwidth neural networks can be surprisingly accurate. These networks use homogeneous binarization: all parameters in each layer or (more commonly) the whole model have the same low bitwidth (e.g., 2 bits). However, modern hardware allows efficient designs where each arithmetic instruction can have a custom bitwidth, motivating heterogeneous binarization, where every parameter in the network may have a different bitwidth. In this paper, we show that it is feasible and useful to select bitwidths at the parameter granularity during training. For instance, a heterogeneously quantized version of modern networks such as AlexNet and MobileNet, with the right mix of 1-, 2-, and 3-bit parameters averaging just 1.4 bits, can equal the accuracy of homogeneous 2-bit versions of these networks. Further, we provide analyses showing that heterogeneously binarized systems yield FPGA- and ASIC-based implementations that are correspondingly more efficient in both circuit area and energy than their homogeneous counterparts.
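A toy sketch of parameter-granularity bitwidth selection, assuming a simple rule I'm inventing for illustration (give extra bits to the weights that 1-bit quantization approximates worst) and fractions chosen to hit the 1.4-bit average from the abstract. Both the selection rule and the midrise quantizer are assumptions, not the authors' training procedure.

```python
import numpy as np

def quantize(w, bits):
    """Midrise uniform quantizer on [-1, 1] with 2**bits levels."""
    step = 2.0 / (2 ** bits)
    q = (np.floor(w / step) + 0.5) * step
    return np.clip(q, -1.0 + step / 2, 1.0 - step / 2)

def heterogeneous_binarize(weights, fractions={1: 0.7, 2: 0.2, 3: 0.1}):
    # Assumed rule: weights with the largest 1-bit quantization error
    # are promoted to 2 or 3 bits; everything else stays at 1 bit.
    err = np.abs(weights - quantize(weights, 1))
    order = np.argsort(-err)                     # worst-approximated first
    bits = np.ones(weights.size, dtype=int)
    n3 = int(fractions[3] * weights.size)
    n2 = int(fractions[2] * weights.size)
    bits[order[:n3]] = 3
    bits[order[n3:n3 + n2]] = 2
    out = np.empty_like(weights)
    for b in (1, 2, 3):
        out[bits == b] = quantize(weights[bits == b], b)
    return out, bits.mean()   # 0.7*1 + 0.2*2 + 0.1*3 = 1.4 bits on average

w = np.random.default_rng(0).uniform(-1.0, 1.0, size=4096)
wq, avg_bits = heterogeneous_binarize(w)
print(f"average bitwidth: {avg_bits:.2f}")       # ~1.40
```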
Deep Submodular Peripteral Networks
Arnav M. Das
Submodular functions, despite their broad applicability, often lack practical methods for being learned from data. Seemingly unrelated, learning a scaling from oracles offering graded pairwise comparisons (GPC) is underexplored, despite a rich history in psychometrics. In this paper, we introduce deep submodular peripteral networks (DSPNs), a novel parametric family of submodular functions, and methods for their training using a GPC-based strategy to connect and then tackle both of the above challenges. We introduce a newly devised GPC-style "peripteral" loss that leverages numerically graded relationships between pairs of objects (sets, in our case). Unlike traditional contrastive learning or RLHF preference ranking, our method utilizes graded comparisons, extracting more nuanced information than binary-outcome comparisons, and contrasts sets of any size (not just two). We also define a novel suite of automatic sampling strategies for training, including active-learning-inspired submodular feedback. We demonstrate DSPNs' efficacy in learning submodularity from a costly target submodular function and demonstrate their superiority in both experimental design and online streaming applications.
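To give a feel for the GPC idea, here is a minimal sketch: a one-layer stand-in for a deep submodular function (a concave function of a modular sum) scores two sets, and a loss regresses the squashed valuation gap onto the oracle's grade. The tanh squashing and squared-error form are simplifications I'm assuming; the paper's actual peripteral loss has a more refined form.

```python
import numpy as np

def dsf_value(features, concave=np.sqrt):
    """A minimal (shallow) submodular function: a concave function applied
    elementwise to a nonnegative modular sum of per-item features, the
    one-layer special case of the deep submodular family DSPNs extend."""
    return float(concave(features.sum(axis=0)).sum())

def peripteral_style_loss(f_a, f_b, grade):
    """Illustrative GPC-style loss: push the squashed valuation gap
    f(A) - f(B) toward the oracle's graded preference in [-1, 1].
    This squared-error form is an assumed stand-in, not the paper's loss."""
    return (np.tanh(f_a - f_b) - grade) ** 2

rng = np.random.default_rng(0)
items = rng.uniform(size=(20, 5))          # 20 items with 5 nonnegative features
A, B = items[:8], items[3:11]              # two overlapping candidate sets
loss = peripteral_style_loss(dsf_value(A), dsf_value(B), grade=0.6)
print(f"loss: {loss:.3f}")
```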